MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
Machine Comprehension (MC) is one of the core problems in natural language processing, requiring both understanding of natural language and knowledge about the world. Rapid progress has been made since the release of several benchmark datasets, and recently the state-of-the-art models even surpass human performance on the well-known SQuAD evaluation. In this paper, we transfer knowledge learned from machine comprehension to sequence-to-sequence tasks to deepen the understanding of the text. We propose MacNet: a novel encoder-decoder supplementary architecture for the widely used attention-based sequence-to-sequence models. Experiments on neural machine translation (NMT) and abstractive text summarization show that our proposed framework can significantly improve the performance of the baseline models, and our method for abstractive text summarization achieves state-of-the-art results on the Gigaword dataset.
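As a rough illustration of the transfer idea described in the abstract (not the authors' exact architecture; all function names here are hypothetical stand-ins), a minimal sketch in which features from a frozen machine-comprehension encoder are concatenated with a sequence-to-sequence encoder's states, so that the decoder's attention can see both, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_encoder(tokens, dim=4):
    # stand-in for a pretrained, frozen machine-comprehension encoder
    return rng.standard_normal((len(tokens), dim))

def seq2seq_encoder(tokens, dim=4):
    # stand-in for the task's own trainable seq2seq encoder
    return rng.standard_normal((len(tokens), dim))

def augmented_states(tokens):
    # concatenate the frozen MC features with the seq2seq states,
    # giving the decoder's attention access to both representations
    h = seq2seq_encoder(tokens)
    m = mc_encoder(tokens)
    return np.concatenate([h, m], axis=-1)  # (len, 2 * dim)

states = augmented_states(["the", "cat", "sat"])
```

In a real system the encoders would be neural networks and the concatenated states would feed the attention mechanism; this sketch only shows the wiring.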
Boyuan Pan, Yazheng Yang, Hao Li, Zhou Zhao, Yueting Zhuang, Deng Cai, Xiaofei He
Machine comprehension (MC) has gained significant popularity over the past few years and is a coveted goal in the field of natural language understanding. Its task is to teach the machine to understand the content of a given passage and then answer a related question, which requires deep comprehension and accurate information extraction from the text.
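For context on the task itself, a crude non-neural baseline (purely illustrative, unrelated to MacNet) reads a passage and a question and returns the passage sentence with the greatest lexical overlap with the question:

```python
def best_sentence(passage, question):
    # crude lexical-overlap baseline: return the passage sentence
    # sharing the most words with the question
    q = set(question.lower().split())
    sents = [s.strip() for s in passage.split(".") if s.strip()]
    return max(sents, key=lambda s: len(q & set(s.lower().split())))

answer = best_sentence("The sky is blue. Cats chase mice.", "What do cats chase")
```

Modern MC models replace this word matching with learned representations of the passage and question, but the input/output contract is the same.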
Reviews: MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
Update after Author Feedback: After reading all the reviews and the author feedback, I have two overall comments. The paper is branded as a transfer learning paper, but I'm left disappointed in this respect. I find it very surprising that the attention can be transferred at all, but it is such a small contribution to the MacNet architecture's overall improvements that it seems a hard sell. Focal losses have been used before and encoders have been transferred before, but they also contribute to performance improvements... Second comment: the ablations on summarization are necessary for a camera-ready version -- that seems like a hole right now, so I hope they are included in future versions. Overall, I'm still a 6 because the paper finds a combination of things (with some surprising novelty) that improves performance, and it has shown me that I should experiment with those things in the future.
Learning to Answer Multilingual and Code-Mixed Questions
Question answering (QA), which comes naturally to humans, is a critical component of seamless human-computer interaction. It has emerged as one of the most convenient and natural methods to interact with the web and is especially desirable in voice-controlled environments. Despite being one of the oldest research areas, current QA systems face the critical challenge of handling multilingual queries. To build an Artificial Intelligence (AI) agent that can serve multilingual end users, a QA system must be language-versatile and tailored to the multilingual environment. Recent advances in QA models have enabled surpassing human performance, primarily due to the availability of sizable amounts of high-quality data. However, the majority of such annotated datasets are expensive to create and are confined to the English language, making it difficult to gauge progress in other languages. Therefore, to measure a similar improvement in multilingual QA systems, it is necessary to invest in high-quality multilingual evaluation benchmarks. In this dissertation, we focus on advancing QA techniques for handling end-user queries in multilingual environments. The dissertation consists of two parts. In the first part, we explore multilingualism and a new dimension of multilingualism referred to as code-mixing. In the second, we propose a technique for multi-hop question generation that exploits multiple documents. Experiments show our models achieve state-of-the-art performance on answer extraction, ranking, and generation tasks across multiple domains of MQA, VQA, and language generation. The proposed techniques are generic and can be widely used across domains and languages to advance QA systems.
Modular Approach to Machine Reading Comprehension: Mixture of Task-Aware Experts
Rayasam, Anirudha, Kamath, Anusha, Kalejaiye, Gabriel Bayomi Tinoco
In this work we present a Mixture of Task-Aware Experts Network for Machine Reading Comprehension on a relatively small dataset. We particularly focus on the issue of common-sense learning, enforcing common ground knowledge by specifically training different expert networks to capture different kinds of relationships between each passage, question, and choice triplet. Moreover, we take inspiration from recent advancements in multitask and transfer learning by training each network on a relevant, focused task. By making the mixture of networks aware of a specific goal through an enforced task and relationship, we achieve state-of-the-art results and reduce over-fitting.
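The core mechanism behind a mixture-of-experts model of this kind can be sketched in a few lines (an illustrative toy, not this paper's network): a gate assigns a softmax weight to each expert given the input, and the expert outputs are combined by those weights.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

def mixture_of_experts(x, experts, gate_w):
    # the gate scores each expert from the input, then the expert
    # outputs are blended by the softmax-normalized scores
    weights = softmax(gate_w @ x)                 # (n_experts,)
    outputs = np.stack([f(x) for f in experts])   # (n_experts, dim)
    return weights @ outputs                      # (dim,)

# toy "experts" standing in for task-aware networks
experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x]
rng = np.random.default_rng(1)
x = np.ones(3)
y = mixture_of_experts(x, experts, rng.standard_normal((3, 3)))
```

Making the experts "task-aware", as the paper does, amounts to training each expert function on its own focused objective before (or while) the gate learns how to weight them.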
A Short Survey of Pre-trained Language Models for Conversational AI - A New Age in NLP
Zaib, Munazza, Sheng, Quan Z., Zhang, Wei Emma
Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem in agent-based computing. Rapid growth in this area is usually hindered by the long-standing problem of data scarcity, as these systems are expected to learn syntax, grammar, decision making, and reasoning from insufficient amounts of task-specific data. The recently introduced pre-trained language models have the potential to address the issue of data scarcity and bring considerable advantages by generating contextualized word embeddings. These models are considered the NLP counterpart of ImageNet and have been shown to capture different facets of language such as hierarchical relations, long-term dependencies, and sentiment. In this short survey paper, we discuss the recent progress made in the field of pre-trained language models. We also discuss how the strengths of these language models can be leveraged in designing more engaging and more eloquent conversational agents. This paper, therefore, intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems, and how their architecture could be exploited in order to overcome these challenges. Open challenges in the field of dialogue systems are also deliberated.
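The distinction between static and contextualized word embeddings, central to the survey above, can be illustrated with a toy model (the vectors and mixing rule here are invented for illustration, nothing like a real pre-trained language model): a static embedding gives a word one fixed vector, while a contextualized one varies with the surrounding sentence.

```python
import numpy as np

# fixed ("static") vectors, one per word regardless of context
STATIC = {"bank": np.array([1.0, 0.0]),
          "river": np.array([0.0, 1.0]),
          "money": np.array([1.0, 1.0])}

def contextual(word, sentence):
    # crude stand-in for contextualization: mix the word's static
    # vector with the mean vector of its context words
    ctx = np.mean([STATIC[w] for w in sentence if w != word], axis=0)
    return 0.5 * STATIC[word] + 0.5 * ctx

a = contextual("bank", ["river", "bank"])   # "bank" near "river"
b = contextual("bank", ["money", "bank"])   # "bank" near "money"
```

A real pre-trained language model computes this context-dependence with deep self-attention rather than averaging, but the key property is the same: `a` and `b` differ even though the word is identical.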